1 SEQUENCE LISTING

2 

3 

4 (1) GENERAL INFORMATION: 
5 

6 (i) APPLICANT: CAPUT, DANIEL 

7 FERRARA, PASCUAL 

8 GUILLEMOT, JEAN-CLAUDE 

9 KAGHAD, MOURAD 

10 LEGOUX, RICHARD 

11 LOISON, GERARD 

12 LARBRE, ELIZABETH 

13 LUPKER, JOHANNES 

14 LEPLATOIS, PASCUAL 

15 SALOME, MARK 
16 

17 (ii) TITLE OF INVENTION: URATE OXIDASE ACTIVITY PROTEIN, 

18 RECOMBINANT GENE CODING THEREFOR, EXPRESSION VECTOR, 

19 MICRO-ORGANISMS AND TRANSFORMED CELLS 
20 

21 (iii) NUMBER OF SEQUENCES: 36 
22 

23 (iv) CORRESPONDENCE ADDRESS: 

24 (A) ADDRESSEE: Foley & Lardner 

25 (B) STREET: 1800 Diagonal Road, Suite 500 

26 (C) CITY: Alexandria 

27 (D) STATE: Virginia 

28 (E) COUNTRY: USA 

29 (F) ZIP: 22313-0299 
30 

31 (v) COMPUTER READABLE FORM: 

32 (A) MEDIUM TYPE: Floppy disk 

33 (B) COMPUTER: IBM PC compatible 

34 (C) OPERATING SYSTEM: PC-DOS/MS-DOS 

35 (D) SOFTWARE: PatentIn Release #1.0, Version #1.25
36 

37 (vi) CURRENT APPLICATION DATA: 

38 (A) APPLICATION NUMBER: US 07/659,408 

39 (B) FILING DATE: 25-APR-1991 

40 (C) CLASSIFICATION: 
41 

42 (viii) ATTORNEY/AGENT INFORMATION:

43 (A) NAME: BENT, Stephen A. 

44 (B) REGISTRATION NUMBER: 29,768 

45 (C) REFERENCE/DOCKET NUMBER: 16781/276 BEDL 
46 

47 (ix) TELECOMMUNICATION INFORMATION: 

48 (A) TELEPHONE: (703)836-9300 

49 (B) TELEFAX: (703)683-4109 

50 (C) TELEX: 899149 
51 

52 

53 (2) INFORMATION FOR SEQ ID NO: 1:

54

55 (i) SEQUENCE CHARACTERISTICS: 

56 (A) LENGTH: 301 amino acids 

57 (B) TYPE: amino acid 

58 (D) TOPOLOGY: linear 
59 

60 (ii) MOLECULE TYPE: protein 

61 

62 (iii) HYPOTHETICAL: NO 

63 

64 (vi) ORIGINAL SOURCE: 

65 (A) ORGANISM: Aspergillus flavus 
66 

67 (vii) IMMEDIATE SOURCE: 

68 (B) CLONE: Urate oxidase 
69 

70 

71 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

72 

73 Ser Ala Val Lys Ala Ala Arg Tyr Gly Lys Asp Asn Val Arg Val Tyr 

74 1 5 10 15 
75 

76 Lys Val His Lys Asp Glu Lys Thr Gly Val Gln Thr Val Tyr Glu Met

77 20 25 30
78

79 Thr Val Cys Val Leu Leu Glu Gly Glu Ile Glu Thr Ser Tyr Thr Lys

80 35 40 45
81

82 Ala Asp Asn Ser Val Ile Val Ala Thr Asp Ser Ile Lys Asn Thr Ile

83 50 55 60
84

85 Tyr Ile Thr Ala Lys Gln Asn Pro Val Thr Pro Pro Glu Leu Phe Gly

86 65 70 75 80
87

88 Ser Ile Leu Gly Thr His Phe Ile Glu Lys Tyr Asn His Ile His Ala

89 85 90 95
90

91 Ala His Val Asn Ile Val Cys His Arg Trp Thr Arg Met Asp Ile Asp

92 100 105 110
93

94 Gly Lys Pro His Pro His Ser Phe Ile Arg Asp Ser Glu Glu Lys Arg

95 115 120 125
96

97 Asn Val Gln Val Asp Val Val Glu Gly Lys Gly Ile Asp Ile Lys Ser

98 130 135 140
99

100 Ser Leu Ser Gly Leu Thr Val Leu Lys Ser Thr Asn Ser Gln Phe Trp

101 145 150 155 160
102

103 Gly Phe Leu Arg Asp Glu Tyr Thr Thr Leu Lys Glu Thr Trp Asp Arg

104 165 170 175
105

106 Ile Leu Ser Thr Asp Val Asp Ala Thr Trp Gln Trp Lys Asn Phe Ser

107 180 185 190
108

109 Gly Leu Gln Glu Val Arg Ser His Val Pro Lys Phe Asp Ala Thr Trp

110 195 200 205
111

112 Ala Thr Ala Arg Glu Val Thr Leu Lys Thr Phe Ala Glu Asp Asn Ser

113 210 215 220
114

115 Ala Ser Val Gln Ala Thr Met Tyr Lys Met Ala Glu Gln Ile Leu Ala

116 225 230 235 240
117

118 Arg Gln Gln Leu Ile Glu Thr Val Glu Tyr Ser Leu Pro Asn Lys His

119 245 250 255
120

121 Tyr Phe Glu Ile Asp Leu Ser Trp His Lys Gly Leu Gln Asn Thr Gly

122 260 265 270
123

124 Lys Asn Ala Glu Val Phe Ala Pro Gln Ser Asp Pro Asn Gly Leu Ile

125 275 280 285 
126 

127 Lys Cys Thr Val Gly Arg Ser Ser Leu Lys Ser Lys Leu 

128 290 295 300 
129 

130 (2) INFORMATION FOR SEQ ID NO: 2: 
131 

132 (i) SEQUENCE CHARACTERISTICS: 

133 (A) LENGTH: 302 amino acids 

134 (B) TYPE: amino acid 

135 (C) STRANDEDNESS: single

136 (D) TOPOLOGY: linear 
137 

138 (ii) MOLECULE TYPE: protein 

139 

140 (iii) HYPOTHETICAL: NO 

141 

142 (vi) ORIGINAL SOURCE: 

143 (A) ORGANISM: Aspergillus flavus 
144 

145 (vii) IMMEDIATE SOURCE: 

146 (B) CLONE: Met-Urate oxidase 
147 

148 

149 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

150 

151 Met Ser Ala Val Lys Ala Ala Arg Tyr Gly Lys Asp Asn Val Arg Val 

152 1 5 10 15
153

154 Tyr Lys Val His Lys Asp Glu Lys Thr Gly Val Gln Thr Val Tyr Glu

155 20 25 30
156

157 Met Thr Val Cys Val Leu Leu Glu Gly Glu Ile Glu Thr Ser Tyr Thr

158 35 40 45
159

160 Lys Ala Asp Asn Ser Val Ile Val Ala Thr Asp Ser Ile Lys Asn Thr

161 50 55 60
162

163 Ile Tyr Ile Thr Ala Lys Gln Asn Pro Val Thr Pro Pro Glu Leu Phe

164 65 70 75 80
165

166 Gly Ser Ile Leu Gly Thr His Phe Ile Glu Lys Tyr Asn His Ile His

167 85 90 95
168

169 Ala Ala His Val Asn Ile Val Cys His Arg Trp Thr Arg Met Asp Ile

170 100 105 110
171

172 Asp Gly Lys Pro His Pro His Ser Phe Ile Arg Asp Ser Glu Glu Lys

173 115 120 125
174

175 Arg Asn Val Gln Val Asp Val Val Glu Gly Lys Gly Ile Asp Ile Lys

176 130 135 140
177

178 Ser Ser Leu Ser Gly Leu Thr Val Leu Lys Ser Thr Asn Ser Gln Phe

179 145 150 155 160
180

181 Trp Gly Phe Leu Arg Asp Glu Tyr Thr Thr Leu Lys Glu Thr Trp Asp

182 165 170 175
183

184 Arg Ile Leu Ser Thr Asp Val Asp Ala Thr Trp Gln Trp Lys Asn Phe

185 180 185 190
186

187 Ser Gly Leu Gln Glu Val Arg Ser His Val Pro Lys Phe Asp Ala Thr

188 195 200 205
189

190 Trp Ala Thr Ala Arg Glu Val Thr Leu Lys Thr Phe Ala Glu Asp Asn

191 210 215 220
192

193 Ser Ala Ser Val Gln Ala Thr Met Tyr Lys Met Ala Glu Gln Ile Leu

194 225 230 235 240
195

196 Ala Arg Gln Gln Leu Ile Glu Thr Val Glu Tyr Ser Leu Pro Asn Lys

197 245 250 255
198

199 His Tyr Phe Glu Ile Asp Leu Ser Trp His Lys Gly Leu Gln Asn Thr

200 260 265 270
201

202 Gly Lys Asn Ala Glu Val Phe Ala Pro Gln Ser Asp Pro Asn Gly Leu

203 275 280 285
204

205 Ile Lys Cys Thr Val Gly Arg Ser Ser Leu Lys Ser Lys Leu

206 290 295 300 
207 

208 (2) INFORMATION FOR SEQ ID NO: 3: 
209 

210 (i) SEQUENCE CHARACTERISTICS: 

211 (A) LENGTH: 906 base pairs 

212 (B) TYPE: nucleic acid

213 (C) STRANDEDNESS: single

214 (D) TOPOLOGY: linear 
215 

216 (ii) MOLECULE TYPE: DNA (genomic) 

217 

218 

219 (vii) IMMEDIATE SOURCE: 

220 (B) CLONE: Preferred sequence for expression in 

221 prokaryotes 
222 

223 

224 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

225 

226 ATGTCTGCGG TAAAAGCAGC GCGCTACGGC AAGGACAATG TTCGCGTCTA CAAGGTTCAC 60 
227 

228 AAGGACGAGA AGACCGGTGT CCAGACGGTG TACGAGATGA CCGTCTGTGT GCTTCTGGAG 120 
229 

230 GGTGAGATTG AGACCTCTTA CACCAAGGCC GACAACAGCG TCATTGTCGC AACCGACTCC 180 
231 

232 ATTAAGAACA CCATTTACAT CACCGCCAAG CAGAACCCCG TTACTCCTCC CGAGCTGTTC 240 
233 

234 GGCTCCATCC TGGGCACACA CTTCATTGAG AAGTACAACC ACATCCATGC CGCTCACGTC 300 
235 

236 AACATTGTCT GCCACCGCTG GACCCGGATG GACATTGACG GCAAGCCACA CCCTCACTCC 360 
237 

238 TTCATCCGCG ACAGCGAGGA GAAGCGGAAT GTGCAGGTGG ACGTGGTCGA GGGCAAGGGC 420 
239 

240 ATCGATATCA AGTCGTCTCT GTCCGGCCTG ACCGTGCTGA AGAGCACCAA CTCGCAGTTC 480 
241 

242 TGGGGCTTCC TGCGTGACGA GTACACCACA CTTAAGGAGA CCTGGGACCG TATCCTGAGC 540 
243 

244 ACCGACGTCG ATGCCACTTG GCAGTGGAAG AATTTCAGTG GACTCCAGGA GGTCCGCTCG 600 
245 

246 CACGTGCCTA AGTTCGATGC TACCTGGGCC ACTGCTCGCG AGGTCACTCT GAAGACTTTT 660 
247 

248 GCTGAAGATA ACAGTGCCAG CGTGCAGGCC ACTATGTACA AGATGGCAGA GCAAATCCTG 720 
249 

250 GCGCGCCAGC AGCTGATCGA GACTGTCGAG TACTCGTTGC CTAACAAGCA CTATTTCGAA 780 
251 

252 ATCGACCTGA GCTGGCACAA GGGCCTCCAA AACACCGGCA AGAACGCCGA GGTCTTCGCT 840 
253 

254 CCTCAGTCGG ACCCCAACGG TCTGATCAAG TGTACCGTCG GCCGGTCCTC TCTGAAGTCT 900 
255 

256 AAATTG 906 
257 

258 (2) INFORMATION FOR SEQ ID NO: 4: 
259 

260 (i) SEQUENCE CHARACTERISTICS: 

261 (A) LENGTH: 906 base pairs 

262 (B) TYPE: nucleic acid 

263 (C) STRANDEDNESS: single 

264 (D) TOPOLOGY: linear 
265

266 (ii) MOLECULE TYPE: DNA (genomic)

267 

268 

269 (vii) IMMEDIATE SOURCE: 

270 (B) CLONE: Preferred sequence for expression in 

271 eukaryotes 
272 

273 

274 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

275 

276 ATGTCTGCTG TTAAGGCTGC TAGATACGGT AAGGACAACG TTAGAGTCTA CAAGGTTCAC 60 
277 

278 AAGGACGAGA AGACCGGTGT CCAGACGGTG TACGAGATGA CCGTCTGTGT GCTTCTGGAG 120 
279 

280 GGTGAGATTG AGACCTCTTA CACCAAGGCC GACAACAGCG TCATTGTCGC AACCGACTCC 180 
281 

282 ATTAAGAACA CCATTTACAT CACCGCCAAG CAGAACCCCG TTACTCCTCC CGAGCTGTTC 240 
283 

284 GGCTCCATCC TGGGCACACA CTTCATTGAG AAGTACAACC ACATCCATGC CGCTCACGTC 300 
285 

286 AACATTGTCT GCCACCGCTG GACCCGGATG GACATTGACG GCAAGCCACA CCCTCACTCC 360 
287 

288 TTCATCCGCG ACAGCGAGGA GAAGCGGAAT GTGCAGGTGG ACGTGGTCGA GGGCAAGGGC 420 
289 

290 ATCGATATCA AGTCGTCTCT GTCCGGCCTG ACCGTGCTGA AGAGCACCAA CTCGCAGTTC 480 
291 

292 TGGGGCTTCC TGCGTGACGA GTACACCACA CTTAAGGAGA CCTGGGACCG TATCCTGAGC 540 
293 

294 ACCGACGTCG ATGCCACTTG GCAGTGGAAG AATTTCAGTG GACTCCAGGA GGTCCGCTCG 600 
295 

296 CACGTGCCTA AGTTCGATGC TACCTGGGCC ACTGCTCGCG AGGTCACTCT GAAGACTTTT 660 
297 

298 GCTGAAGATA ACAGTGCCAG CGTGCAGGCC ACTATGTACA AGATGGCAGA GCAAATCCTG 720 
299 

300 GCGCGCCAGC AGCTGATCGA GACTGTCGAG TACTCGTTGC CTAACAAGCA CTATTTCGAA 780 
301 

302 ATCGACCTGA GCTGGCACAA GGGCCTCCAA AACACCGGCA AGAACGCCGA GGTCTTCGCT 840 
303 

304 CCTCAGTCGG ACCCCAACGG TCTGATCAAG TGTACCGTCG GCCGGTCCTC TCTGAAGTCT 900 
305 

306 AAATTG 906 
307 

308 (2) INFORMATION FOR SEQ ID NO: 5: 
309 

310 (i) SEQUENCE CHARACTERISTICS: 

311 (A) LENGTH: 14 base pairs 

312 (B) TYPE: nucleic acid 

313 (C) STRANDEDNESS: single

314 (D) TOPOLOGY: linear 
315 

316 (ii) MOLECULE TYPE: DNA (genomic) 

317 

318 (iii) HYPOTHETICAL: NO

319
320 

321 (vii) IMMEDIATE SOURCE: 

322 (B) CLONE: Preferred non-translated 5' sequence for 

323 animal cells 
324 

325 

326 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

327 

328 AGCTTGCCGC CACT 14 
329 

330 (2) INFORMATION FOR SEQ ID NO: 6: 
331 

332 (i) SEQUENCE CHARACTERISTICS: 

333 (A) LENGTH: 906 base pairs 

334 (B) TYPE: nucleic acid 

335 (C) STRANDEDNESS: double

336 (D) TOPOLOGY: linear 
337 

338 (ii) MOLECULE TYPE: DNA (genomic) 

339 

340 (iii) HYPOTHETICAL: NO 

341 

342 

343 (vii) IMMEDIATE SOURCE: 

344 (B) CLONE: Preferred sequence for expression in animal 

345 cells 
346 

347 

348 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

349 

350 ATGTCCGCAG TAAAAGCAGC CCGCTACGGC AAGGACAATG TCCGCGTCTA CAAGGTTCAC 60 
351 

352 AAGGACGAGA AGACCGGTGT CCAGACGGTG TACGAGATGA CCGTCTGTGT GCTTCTGGAG 120 
353 

354 GGTGAGATTG AGACCTCTTA CACCAAGGCC GACAACAGCG TCATTGTCGC AACCGACTCC 180 
355 

356 ATTAAGAACA CCATTTACAT CACCGCCAAG CAGAACCCCG TTACTCCTCC CGAGCTGTTC 240 
357 

358 GGCTCCATCC TGGGCACACA CTTCATTGAG AAGTACAACC ACATCCATGC CGCTCACGTC 300 
359 

360 AACATTGTCT GCCACCGCTG GACCCGGATG GACATTGACG GCAAGCCACA CCCTCACTCC 360 
361 

362 TTCATCCGCG ACAGCGAGGA GAAGCGGAAT GTGCAGGTGG ACGTGGTCGA GGGCAAGGGC 420 
363 

364 ATCGATATCA AGTCGTCTCT GTCCGGCCTG ACCGTGCTGA AGAGCACCAA CTCGCAGTTC 480 
365 

366 TGGGGCTTCC TGCGTGACGA GTACACCACA CTTAAGGAGA CCTGGGACCG TATCCTGAGC 540 
367 

368 ACCGACGTCG ATGCCACTTG GCAGTGGAAG AATTTCAGTG GACTCCAGGA GGTCCGCTCG 600 
369 

370 CACGTGCCTA AGTTCGATGC TACCTGGGCC ACTGCTCGCG AGGTCACTCT GAAGACTTTT 660 
371

372 GCTGAAGATA ACAGTGCCAG CGTGCAGGCC ACTATGTACA AGATGGCAGA GCAAATCCTG 720
373 

374 GCGCGCCAGC AGCTGATCGA GACTGTCGAG TACTCGTTGC CTAACAAGCA CTATTTCGAA 780 
375 

376 ATCGACCTGA GCTGGCACAA GGGCCTCCAA AACACCGGCA AGAACGCCGA GGTCTTCGCT 840 
377 

378 CCTCAGTCGG ACCCCAACGG TCTGATCAAG TGTACCGTCG GCCGGTCCTC TCTGAAGTCT 900 
379 

380 AAATTG 906 
381 

382 (2) INFORMATION FOR SEQ ID NO: 7: 
383 

384 (i) SEQUENCE CHARACTERISTICS: 

385 (A) LENGTH: 23 base pairs 

386 (B) TYPE: nucleic acid 

387 (C) STRANDEDNESS: single

388 (D) TOPOLOGY: linear 
389 

390 (ii) MOLECULE TYPE: DNA (genomic) 

391 

392 (iii) HYPOTHETICAL: NO 

393 

394 

395 (vii) IMMEDIATE SOURCE: 

396 (B) CLONE: reverse transcription primer 
397 

398 

399 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

400 

401 GATCCGGGCC CTTTTTTTTT TTT 23 
402 

403 (2) INFORMATION FOR SEQ ID NO: 8: 
404 

405 (i) SEQUENCE CHARACTERISTICS: 

406 (A) LENGTH: 10 amino acids 

407 (B) TYPE: amino acid 

408 (C) STRANDEDNESS: single 

409 (D) TOPOLOGY: linear 
410 

411 (ii) MOLECULE TYPE: peptide 

412 

413 (iii) HYPOTHETICAL: NO 

414 

415 

416 (vii) IMMEDIATE SOURCE: 

417 (B) CLONE: Hydrolysis product T 17 
418 

419 

420 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

421 

422 Asn Val Gln Val Asp Val Val Glu Gly Lys

423 1 5 10
424

425 (2) INFORMATION FOR SEQ ID NO: 9:
426 

427 (i) SEQUENCE CHARACTERISTICS: 

428 (A) LENGTH: 8 amino acids 

429 (B) TYPE: amino acid 

430 (C) STRANDEDNESS: single

431 (D) TOPOLOGY: linear 
432 

433 (ii) MOLECULE TYPE: peptide 

434 

435 (iii) HYPOTHETICAL: NO 

436 

437 

438 (vii) IMMEDIATE SOURCE: 

439 (B) CLONE: Hydrolysis product T 20 
440 

441 

442 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

443 

444 Asn Phe Ser Gly Leu Gln Glu Val

445 1 5 
446 

447 (2) INFORMATION FOR SEQ ID NO: 10: 
448 

449 (i) SEQUENCE CHARACTERISTICS: 

450 (A) LENGTH: 6 amino acids 

451 (B) TYPE: amino acid 

452 (C) STRANDEDNESS: single 

453 (D) TOPOLOGY: linear 
454 

455 (ii) MOLECULE TYPE: peptide 

456 

457 (iii) HYPOTHETICAL: NO 

458 

459 

460 (vii) IMMEDIATE SOURCE: 

461 (B) CLONE: Hydrolysis product T 23 
462 

463 

464 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

465 

466 Phe Asp Ala Thr Trp Ala 

467 1 5 
468 

469 (2) INFORMATION FOR SEQ ID NO: 11:
470 

471 (i) SEQUENCE CHARACTERISTICS: 

472 (A) LENGTH: 8 amino acids 

473 (B) TYPE: amino acid 

474 (C) STRANDEDNESS: single 

475 (D) TOPOLOGY: linear 
476 

477 (ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO



(vii) IMMEDIATE SOURCE: 

(B) CLONE: Hydrolysis product T 27 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

His Tyr Phe Glu Ile Asp Leu Ser
1 5 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(vii) IMMEDIATE SOURCE: 

(B) CLONE: Hydrolysis product T 28 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Ile Leu Ser Thr Asp Val Asp Ala Thr Trp Gln Trp Lys
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(vii) IMMEDIATE SOURCE: 

(B) CLONE: Hydrolysis product T 29 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
531

532 His Tyr Phe Glu Ile Asp Leu Ser Trp His Lys

533 1 5 10 
534 

535 (2) INFORMATION FOR SEQ ID NO: 14: 
536 

537 (i) SEQUENCE CHARACTERISTICS: 



538 (A) LENGTH: 11 amino acids 

539 (B) TYPE: amino acid 

540 (C) STRANDEDNESS: single

541 (D) TOPOLOGY: linear 
542 



543 (ii) MOLECULE TYPE: peptide 

544 

545 (iii) HYPOTHETICAL: NO 

546 

547 

548 (vii) IMMEDIATE SOURCE: 

549 (B) CLONE: Hydrolysis product T 31 
550 

551 

552 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

553 

554 Ser Thr Asn Ser Gln Phe Trp Gly Phe Leu Arg

555 1 5 10 
556 

557 (2) INFORMATION FOR SEQ ID NO: 15: 
558 

559 (i) SEQUENCE CHARACTERISTICS: 



560 (A) LENGTH: 16 amino acids 

561 (B) TYPE: amino acid 

562 (C) STRANDEDNESS: single 

563 (D) TOPOLOGY: linear 
564 



565 (ii) MOLECULE TYPE: peptide 

566 

567 (iii) HYPOTHETICAL: NO 

568 

569 

570 (vii) IMMEDIATE SOURCE: 

571 (B) CLONE: Hydrolysis product T 32 
572 

573 

574 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

575 

576 Gln Asn Pro Val Thr Pro Pro Glu Leu Phe Gly Ser Ile Leu Gly Thr

577 1 5 10 15 
578 

579 

580 (2) INFORMATION FOR SEQ ID NO: 16: 
581 

582 (i) SEQUENCE CHARACTERISTICS: 

583 (A) LENGTH: 16 amino acids

584 (B) TYPE: amino acid

585 (C) STRANDEDNESS: single

586 (D) TOPOLOGY: linear 
587 

588 (ii) MOLECULE TYPE: peptide 

589 

590 (iii) HYPOTHETICAL: NO 

591 

592 

593 (vii) IMMEDIATE SOURCE: 

594 (B) CLONE: Hydrolysis product T 33 
595 

596 

597 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

598 

599 Gln Asn Pro Val Thr Pro Pro Glu Leu Phe Gly Ser Ile Leu Gly Thr

600 1 5 10 15 
601 

602 

603 (2) INFORMATION FOR SEQ ID NO: 17: 
604 

605 (i) SEQUENCE CHARACTERISTICS: 

606 (A) LENGTH: 17 amino acids 

607 (B) TYPE: amino acid 

608 (C) STRANDEDNESS: single 

609 (D) TOPOLOGY: linear 
610 

611 (ii) MOLECULE TYPE: peptide 

612 

613 (iii) HYPOTHETICAL: NO 

614 

615 

616 (vii) IMMEDIATE SOURCE: 

617 (B) CLONE: Hydrolysis product V 1 
618 

619 

620 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

621 

622 Tyr Ser Leu Pro Asn Lys His Tyr Phe Glu Ile Asp Leu Ser Trp His

623 1 5 10 15 
624 

625 Lys 

626 

627 

628 (2) INFORMATION FOR SEQ ID NO: 18: 
629 

630 (i) SEQUENCE CHARACTERISTICS: 

631 (A) LENGTH: 16 amino acids 

632 (B) TYPE: amino acid 

633 (C) STRANDEDNESS: single 

634 (D) TOPOLOGY: linear 
635 

636 (ii) MOLECULE TYPE: peptide
637

638 (iii) HYPOTHETICAL: NO 

639 

640 

641 (vii) IMMEDIATE SOURCE: 

642 (B) CLONE: Hydrolysis product V 2 
643 

644 

645 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

646 

647 Val Thr Leu Lys Thr Phe Ala Glu Asp Asn Ser Ala Ser Val Gln Ala

648 1 5 10 15
649 

650 

651 (2) INFORMATION FOR SEQ ID NO: 19: 
652 

653 (i) SEQUENCE CHARACTERISTICS: 



654 (A) LENGTH: 24 amino acids 

655 (B) TYPE: amino acid 

656 (C) STRANDEDNESS: single

657 (D) TOPOLOGY: linear 
658 



659 (ii) MOLECULE TYPE: peptide 

660 

661 (iii) HYPOTHETICAL: NO 

662 

663 

664 (vii) IMMEDIATE SOURCE: 

665 (B) CLONE: Hydrolysis product V 3 
666 

667 

668 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

669 

670 Thr Ser Tyr Thr Lys Ala Asp Asn Ser Val Ile Val Asp Thr Asp Ser

671 1 5 10 15
672

673 Ile Lys Asn Thr Ile Tyr Ile Thr

674 20 
675 

676 (2) INFORMATION FOR SEQ ID NO: 20: 
677 

678 (i) SEQUENCE CHARACTERISTICS: 



679 (A) LENGTH: 28 amino acids 

680 (B) TYPE: amino acid 

681 (C) STRANDEDNESS: single 

682 (D) TOPOLOGY: linear 
683 



684 (ii) MOLECULE TYPE: peptide 

685 

686 (iii) HYPOTHETICAL: NO 

687 

688 

689 (vii) IMMEDIATE SOURCE:

690 (B) CLONE: Hydrolysis product V 5

691 

692 

693 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 

694 

695 Gly Lys Gly Ile Asp Ile Lys Ser Ser Leu Ser Gly Leu Thr Val Leu

696 1 5 10 15
697

698 Lys Ser Thr Asn Ser Gln Phe Trp Gly Phe Leu Arg

699 20 25
700 

701 (2) INFORMATION FOR SEQ ID NO: 21: 
702 

703 (i) SEQUENCE CHARACTERISTICS: 

704 (A) LENGTH: 17 amino acids 

705 (B) TYPE: amino acid 

706 (C) STRANDEDNESS: single

707 (D) TOPOLOGY: linear 
708 

709 (ii) MOLECULE TYPE: peptide 

710 

711 (iii) HYPOTHETICAL: NO 

712 

713 

714 (vii) IMMEDIATE SOURCE: 

715 (B) CLONE: Hydrolysis product V 6
716 

717 

718 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

719 

720 Gly Lys Gly Ile Asp Ile Lys Ser Ser Leu Ser Gly Leu Thr Val Leu

721 1 5 10 15 
722 

723 Lys 

724 

725 

726 (2) INFORMATION FOR SEQ ID NO: 22: 
727 

728 (i) SEQUENCE CHARACTERISTICS: 

729 (A) LENGTH: 1236 base pairs 

730 (B) TYPE: nucleic acid 

731 (C) STRANDEDNESS: single 

732 (D) TOPOLOGY: linear 
733 

734 (ii) MOLECULE TYPE: DNA (genomic) 

735 

736 (iii) HYPOTHETICAL: NO 

737 

738 

739 (vii) IMMEDIATE SOURCE: 

740 (B) CLONE: Fragment 3 
741 

742



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

GATCCGCGGA AGCATAAAGT GTAAAGCCTG GGGTGCCTAA TGAGTGAGCT AACTTACATT 60 

AATTGCGTTG CGCTCACTGC CCGCTTTCCA GTCGGGAAAC CTGTCGTGCC AGCTGCATTA 120 

ATGAATCGGC CAACGCGCGG GGAGAGGCGG TTTGCGTATT GGGCGCCAGG GTGGTTTTTC 180 

TTTTCACCAG TGAGACGGGC AACAGCTGAT TGCCCTTCAC CGCCTGGCCC TGAGAGAGTT 240 

GCAGCAAGCG GTCCACGCTG GTTTGCCCCA CCACCCGAAA ATCCTGTTTG ATGGTGGTTA 300 

ACGGCGGGAT ATAACATGAG CTGTCTTCGG TATCGTCGTA TCCCACTACC GAGATATCCG 360 

CACCAACGCG CAGCCCGGAC TCGGTAATGG CGCGCATTGC GCCCAGCGCC ATCTGATCGT 420 

TGGCAACCAG CATCGCAGTG GGAACGATGC CCTCATTCAG CATTTGCATG GTTTGTTGAA 480 

AACCGGACAT GGCACTCCAG TCGCCTTCCC GTTCCGCTAT CGGCTGAATT TGATTGCGAG 540 

TGAGATATTT ATGCCAGCCA GCCAGACGCA GACGCGCCGA GACAGAACTT AATGGGCCCG 600 

CTAACAGCGC GATTTGCTGG TGACCCAATG CGACCAGATG CTCCACGCCC AGTCGCGTAC 660 

CGTCTTCATG GGAGAAAATA ATACTGTTGA TGGGTGTCTG GTCAGAGACA TCAAGAAATA 720 

ACGCCGGAAC ATTAGTGCAG GCAGCTTCCA CAGCAATGGC ATCCTGGTCA TCCAGCGGAT 780 

AGTTAATGAT CAGCCCACTG ACGCGTTGCG CGAGAAGATT GTGCACCGCC GCTTTACAGG 840 

CTTCGACGCC GCTTCGTTCT ACCATCGACA CCACCACGCT GGCACCCAGT TGATCGGCGC 900 

GAGATTTAAT CGCCGCGACA ATTTGCGACG GCGCGTGCAG GGCCAGACTG GAGGTGGCAA 960 

CGCCAATCAG CAACGACTGT TTGCCCGCCA GTTGTTGTGC CACGCGGTTG GGAATGTAAT 1020 

TCAGCTCCGC CATCGCCGCT TCCACTTTTT CCCGCGTTTT CGCAGAAACG TGGCTGGCCT 1080 

GGTTCACCAC GCGGGAAACG GTCTGATAAC AGACACCGGC ATACTCTGCG ACATCGTATA 1140 

ACGTTACTGG TTTCACATTC ACCACCCTGA ATTGACTCTC TTCCGGGCGC TATCATGCCA 1200 

TACCGCGAAA GGTTTTGCGC CATTCGATGG TGTCCG 1236 
(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 321 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO



(vii) IMMEDIATE SOURCE: 

(B) CLONE: Fragment 4 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 107..316

(D) OTHER INFORMATION: /product= "regulatory signal + aa 
1-44 human growth hormone precursor" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 



TCGAGCTGAC TGACCTGTTG CTTATATTAC ATCGATAGCG TATAATGTGT GGAATTGTGA 60 

GCGATAACAA TTTCACACAG TTTAACTTTA AGAAGGAGAT ATACAT ATG GCT ACC 115 

Met Ala Thr 
1 

GGA TCC CGG ACT AGT CTG CTC CTG GCT TTT GGC CTG CTC TGC CTG CCC 163 
Gly Ser Arg Thr Ser Leu Leu Leu Ala Phe Gly Leu Leu Cys Leu Pro 
5 10 15 

TGG CTT CAA GAG GGC AGT GCC TTC CCA ACC ATT CCC TTA TCT AGA CTT 211 
Trp Leu Gln Glu Gly Ser Ala Phe Pro Thr Ile Pro Leu Ser Arg Leu
20 25 30 35 

TTT GAC AAC GCT ATG CTC CGC GCC CAT CGT CTG CAC CAG CTG GCC TTT 259 
Phe Asp Asn Ala Met Leu Arg Ala His Arg Leu His Gln Leu Ala Phe
40 45 50 

GAC ACC TAC CAG GAG TTT GAA GAA GCC TAT ATC CCA AAG GAA CAG AAG 307 
Asp Thr Tyr Gln Glu Phe Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys
55 60 65 

TAT TCA TTC CTGCA 321 
Tyr Ser Phe 
70 



(2) INFORMATION FOR SEQ ID NO: 24: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 70 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

850 Met Ala Thr Gly Ser Arg Thr Ser Leu Leu Leu Ala Phe Gly Leu Leu

851 1 5 10 15
852

853 Cys Leu Pro Trp Leu Gln Glu Gly Ser Ala Phe Pro Thr Ile Pro Leu

854 20 25 30
855

856 Ser Arg Leu Phe Asp Asn Ala Met Leu Arg Ala His Arg Leu His Gln

857 35 40 45
858

859 Leu Ala Phe Asp Thr Tyr Gln Glu Phe Glu Glu Ala Tyr Ile Pro Lys

860 50 55 60
861

862 Glu Gln Lys Tyr Ser Phe

863 65 70 
864 

865 (2) INFORMATION FOR SEQ ID NO: 25: 
866 

867 (i) SEQUENCE CHARACTERISTICS: 

868 (A) LENGTH: 74 base pairs 

869 (B) TYPE: nucleic acid 

870 (C) STRANDEDNESS: double

871 (D) TOPOLOGY: linear 
872 

873 (ii) MOLECULE TYPE: DNA (genomic) 

874 

875 (iii) HYPOTHETICAL: NO 

876 

877 

878 (vii) IMMEDIATE SOURCE: 

879 (B) CLONE: ClaI-NdeI fragment
880 

881 

882 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 

883 

884 CGATAGCGTA TAATGTGTGG AATTGTGAGC GGATAACAAT TTCACACAGT TTTTCGCGAA 60
885

886 GAAGGAGATA TACA 74
887 

888 (2) INFORMATION FOR SEQ ID NO: 26: 
889 

890 (i) SEQUENCE CHARACTERISTICS: 

891 (A) LENGTH: 190 base pairs 

892 (B) TYPE: nucleic acid 

893 (C) STRANDEDNESS: double 

894 (D) TOPOLOGY: linear 
895 

896 (ii) MOLECULE TYPE: DNA (genomic) 

897 

898 (iii) HYPOTHETICAL: NO 

899 

900 

901 (vii) IMMEDIATE SOURCE:

(B) CLONE: Plasmid p373,2 fragment

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

GATCTTCAAG CAGACCTACA GCAAGTTCGA CACAAACTCA CACAACGATG ACGCACTACT 60 

CAAGAACTAC GGGCTGCTCT ACTGCTTCAG GAAGGACATG GACAAGGTCG AGACATTCCT 120 

GCGCATCGTG CAGTGCCGCT CTGTGGAGGG CAGCTGTGGC TTCTAGTAAG GTACCCTGCC 180 

CTACGTACCA 190 
(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 



(vii) IMMEDIATE SOURCE: 

(B) CLONE: AccI-NdeI synthetic fragment



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
TATGTCTGCG GTAAAAGCAG CGCGCTACGG CAAGGACAAT GTTCGCGT 48 



(2) INFORMATION FOR SEQ ID NO:28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 360 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 



(vii) IMMEDIATE SOURCE: 

(B) CLONE: Plasmid pEMR469 fragment 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:

GGGACGCGTC TCCTCTGCCG GAACACCGGG CATCTCCAAC TTATAAGTTG GAGAAATAAG 60

AGAATTTCAG ATTGAGAGAA TGAAAAAAAA AAAAAAAAAA AAGGCAGAGG AGAGCATAGA 120 

AATGGGGTTC ACTTTTTGGT AAAGCTATAG CATGCCTATC ACATATAAAT AGAGTGCCAG 180 

TAGCGACTTT TTTCACACTC GAGATACTCT TACTACTGCT CTCTTGTTGT TTTTATCACT 240 

TCTTGTTTCT TCTTGGTAAA TAGAATATCA AGCTACAAAA AGCATACAAT CAACTATCAA 300 

CTATTAACTA TATCGATACC ATATGGATCC GTCGACTCTA GAGGATCGTC GACTCTAGAG 360 



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 58 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 



(vii) IMMEDIATE SOURCE: 

(B) CLONE: Fragment C 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
CGATATACAC AATGTCTGCT GTTAAGGCTG CTAGATACGG TAAGGACAAC GTTAGAGT 58 
(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1013 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 



(vii) IMMEDIATE SOURCE: 

(B) CLONE: Fragment D 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

CTACAAGGTT CACAAGGACG AGAAGACCGG TGTCCAGACG GTGTACGAGA TGACCGTCTG 60

TGTGCTTCTG GAGGGTGAGA TTGAGACCTC TTACACCAAG GCCGACAACA GCGTCATTGT 120

CGCAACCGAC TCCATTAAGA ACACCATTTA CATCACCGCC AAGCAGAACC CCGTTACTCC 180

TCCCGAGCTG TTCGGCTCCA TCCTGGGCAC ACACTTCATT GAGAAGTACA ACCACATCCA 240

TGCCGCTCAC GTCAACATTG TCTGCCACCG CTGGACCCGG ATGGACATTG ACGGCAAGCC 300

ACACCCTCAC TCCTTCATCC GCGACAGCGA GGAGAAGCGG AATGTGCAGG TGGACGTGGT 360

CGAGGGCAAG GGCATCGATA TCAAGTCGTC TCTGTCCGGC CTGACCGTGC TGAAGAGCAC 420

CAACTCGCAG TTCTGGGGCT TCCTGCGTGA CGAGTACACC ACACTTAAGG AGACCTGGGA 480

CCGTATCCTG AGCACCGACG TCGATGCCAC TTGGCAGTGG AAGAATTTCA GTGGACTCCA 540

GGAGGTCCGC TCGCACGTGC CTAAGTTCGA TGCTACCTGG GCCACTGCTC GCGAGGTCAC 600

TCTGAAGACT TTTGCTGAAG ATAACAGTGC CAGCGTGCAG GCCACTATGT ACAAGATGGC 660

AGAGCAAATC CTGGCGCGCC AGCAGCTGAT CGAGACTGTC GAGTACTCGT TGCCTAACAA 720

GCACTATTTC GAAATCGACC TGAGCTGGCA CAAGGGCCTC CAAAACACCG GCAAGAACGC 780

CGAGGTCTTC GCTCCTCAGT CGGACCCCAA CGGTCTGATC AAGTGTACCG TCGGCCGGTC 840

CTCTCTGAAG TCTAAATTGT AAACCAACAT GATTCTCACG TTCCGGAGTT TCCAAGGCAA 900

ACTGTATATA GTCTGGGATA GGGTATAGCA TTCATTCACT TGTTTTTTAC TTCCAAAAAA 960

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAGGGC CCG 1013

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 207 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: Synthetic GAL 7 fragment

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:

1061 CGCGTCTATA CTTCGGAGCA CTGTTGAGCG AAGGCTCATT AGATATATTT TCTGTCATTT 60
1062 

1063 TCCTTAACCC AAAAATAAGG GAGAGGGTCC AAAAAGCGCT CGGACAACTG TTGACCGTGA 120 
1064 

1065 TCCGAAGGAC TGGCTATACA GTGTTCACAA AATAGCCAAG CTGAAAATAA TGTGTAGCCT 180 
1066 

1067 TTAGCTATGT TCAGTTAGTT TGGCATG 207 
1068 

1069 (2) INFORMATION FOR SEQ ID NO: 32: 
1070 

1071 (i) SEQUENCE CHARACTERISTICS: 

1072 (A) LENGTH: 23 base pairs 

1073 (B) TYPE: nucleic acid 

1074 (C) STRANDEDNESS: single

1075 (D) TOPOLOGY: linear 
1076 

1077 (ii) MOLECULE TYPE: DNA (genomic) 

1078 

1079 (iii) HYPOTHETICAL: NO 

1080 

1081 

1082 (vii) IMMEDIATE SOURCE: 

1083 (B) CLONE: Modified XbaI-MluI adapter
1084 

1085 

1086 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

1087 

1088 CTAGGCTAGC GGGCCCGCAT GCA 23 
1089 

1090 (2) INFORMATION FOR SEQ ID NO: 33: 
1091 

1092 (i) SEQUENCE CHARACTERISTICS: 

1093 (A) LENGTH: 422 base pairs 

1094 (B) TYPE: nucleic acid 

1095 (C) STRANDEDNESS: single 

1096 (D) TOPOLOGY: linear 
1097 

1098 (ii) MOLECULE TYPE: DNA (genomic) 

1099 

1100 (iii) HYPOTHETICAL: NO 

1101 

1102 

1103 (vii) IMMEDIATE SOURCE: 

1104 (B) CLONE: Plasmid pSE1 "site binding to HindIII"

1105 fragment 
1106 

1107 

1108 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

1109 

1110 AGCTGGCTCG CATCTCTCCT TCACGCGCCC GCCGCCCTAC CTGAGGCCGC CATCCACGCC 60 
1111 

1112 GGTGAGTCGC GTTCTGCCGC CTCCCGCCTG TGGTGCCTCC TGAACTGCGT CCGCCGTCTA 120 
1113

GGTAGGCTCC AAGGGAGCCG GACAAAGGCC CGGTCTCGAC CTGAGCTCTA AACTTACCTA 180

GACTCAGCCG GCTCTCCACG CTTTGCCTGA CCCTGCTTGC TCAACTCTAC GTCTTTGTTT 240 

CGTTTTCTGT TCTGCGCCGT TACAACTTCA AGGTATGCGC TGGGACCTGG CAGGCGGCAT 300 

CTGGGACCCC TAGGAAGGGC TTGGGGGTCC TCGTGCCCAA GGCAGGGAAC ATAGTGGTCC 360 

CAGGAAGGGG AGCAGAGGCA TCAGGGTGTC CACTTTGTCT CCGCAGCTCC TGAGCCTGCA 420 

GA 422 
(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 77 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: Synthetic HindIII-"site binding to BamHI" 
fragment 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 



AGCTTGTCGA CTAATACGAC TCACTATAGG GCGGCCGCGG GCCCCTGCAG GAATTCGGAT 60 
CCCCCGGGTG ACTGACT 77 



(2) INFORMATION FOR SEQ ID NO: 35: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 61 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(iii) HYPOTHETICAL: NO 



(vii) IMMEDIATE SOURCE: 

(B) CLONE: Synthetic HindIII-AccI fragment

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
AGCTTGCCGC CACTATGTCC GCAGTAAAAG CAGCCCGCTA CGGCAAGGAC AATGTCCGCG 60
T 61

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 920 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: HindIII-SnaBI fragment

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
AGCTTGCCGC CACTATGTCC GCAGTAAAAG CAGCCCGCTA CGGCAAGGAC AATGTCCGCG 60
TCTACAAGGT TCACAAGGAC GAGAAGACCG GTGTCCAGAC GGTGTACGAG ATGACCGTCT 120
GTGTGCTTCT GGAGGGTGAG ATTGAGACCT CTTACACCAA GGCCGACAAC AGCGTCATTG 180
TCGCAACCGA CTCCATTAAG AACACCATTT ACATCACCGC CAAGCAGAAC CCCGTTACTC 240
CTCCCGAGCT GTTCGGCTCC ATCCTGGGCA CACACTTCAT TGAGAAGTAC AACCACATCC 300
ATGCCGCTCA CGTCAACATT GTCTGCCACC GCTGGACCCG GATGGACATT GACGGCAAGC 360
CACACCCTCA CTCCTTCATC CGCGACAGCG AGGAGAAGCG GAATGTGCAG GTGGACGTGG 420
TCGAGGGCAA GGGCATCGAT ATCAAGTCGT CTCTGTCCGG CCTGACCGTG CTGAAGAGCA 480
CCAACTCGCA GTTCTGGGGC TTCCTGCGTG ACGAGTACAC CACACTTAAG GAGACCTGGG 540
ACCGTATCCT GAGCACCGAC GTCGATGCCA CTTGGCAGTG GAAGAATTTC AGTGGACTCC 600
AGGAGGTCCG CTCGCACGTG CCTAAGTTCG ATGCTACCTG GGCCACTGCT CGCGAGGTCA 660
CTCTGAAGAC TTTTGCTGAA GATAACAGTG CCAGCGTGCA GGCCACTATG TACAAGATGG 720
CAGAGCAAAT CCTGGCGCGC CAGCAGCTGA TCGAGACTGT CGAGTACTCG TTGCCTAACA 780
AGCACTATTT CGAAATCGAC CTGAGCTGGC ACAAGGGCCT CCAAAACACC GGCAAGAACG 840
CCGAGGTCTT CGCTCCTCAG TCGGACCCCA ACGGTCTGAT CAAGTGTACC GTCGGCCGGT 900
CCTCTCTGAA GTCTAAATTG 920